1 Advanced Computer Architecture


Many high-performance microprocessors support multithreading in hardware. 


(a) In coarse-grained multithreading, threads switch following specific events.

(i) What hardware support is required for coarse-grained multithreading?
[3 marks] 


(ii) What hardware can be provided to reduce the cost of thread switching in
coarse-grained multithreading and how does it help? [3 marks] 


(b) In fine-grained multithreading, a new thread is selected to be fetched on each
clock cycle. 


(i) How can fine-grained multithreading reduce the hardware requirements of
a simple in-order processor in some circumstances? [3 marks] 


(ii) What is the impact on performance of fine-grained multithreading and how
can it be improved? [3 marks] 


(c) In simultaneous multithreading, threads co-exist within a core.


(i) Describe a scenario where overall performance will improve, and another
where it will get worse, with simultaneous multithreading. [4 marks] 


(ii) What factors need to be considered when deciding whether to duplicate, 
partition or share a core resource? [4 marks] 
2 Bioinformatics


(a) You are a data scientist working at a hospital. A former in-patient claims to
have been infected with HIV during their previous stay in the hospital. You
have access to blood samples of a number of patients who were hospitalised at 
the same time as the claimant. 


(i) Describe how you would investigate the claim. [5 marks] 
(ii) Discuss how to evaluate the robustness of your finding. [5 marks]


(b) You are given a table of gene expression data for different patients and controls
(healthy people). The rows represent different genes, columns represent a group 
of patients or controls. Discuss an algorithm to identify the most important 
genes involved in the disease, its complexity and limitations. [5 marks] 


(c) A new deep-sea animal species is captured by a British expedition in the
Mariana Trench, the deepest trench in the world, and a DNA sample is
successfully sequenced. Species in extreme environments have usually adapted
by accumulating large numbers of genomic rearrangements and mutations with
respect to ancestor species. Discuss DNA coverage and read length for robust
genome assembly in the case where the genome of the new species contains a large
number of repeated genomic regions. [5 marks]
3 Business Studies


After completing your Computer Science degree, you and a small group of friends 
start a generative AI company to help lecturers produce better-designed slides and 
course material. 


You grow the company successfully and raise a number of funding rounds. 
Unfortunately you are not yet profitable, and the fund-raising environment changes 
just as you are about to start your next round. As a result, you decide to push it 
back by 6 to 9 months. 


(a) As the Head of Product Development for the company, how would you go about 
making your development budget last until the next fund-raising round? As part
of your answer, consider the implications for your team. [8 marks] 


(b) Following the delayed fund-raising plan, the board have determined that the 
CEO has not been “consistently candid” with the board or customers, and 
needs to be replaced. You have agreed to become the new CEO. How would you 
manage the transition while minimising the negative impact on the company? 

[12 marks] 
4 Cryptography


(a) Consider a cyclic group $(G, \cdot)$ of order $q$ with generator $g$.


Briefly explain the difference between the Computational Diffie-Hellman 
problem and the Decision Diffie-Hellman problem for G, and state how if one 
of these problems is hard for G, what this implies for the other. [6 marks] 


(b) While decompiling the executable of an ECDSA implementation with unknown
domain parameters, you encounter a prime-number constant of the form 


0x ffffffff ffffffff fffffffe 0626bd0c 2f33945b 7d67dbcb


Based on the structure of its hexadecimal representation, what role could this 
number play? Explain your answer based on how elliptic-curve groups used in 
cryptography can be constructed. [6 marks] 


(c) A certification authority $C$ would like to issue certificates that bind a user $A$'s
public key $PK_A$ to not just that user's name, but to 10 different personal
attribute values $A_0, \ldots, A_9$, e.g. forename, surname, year of birth, birthday,
gender, country, postcode, street address, email, portrait photo. User A can 
then use such a certificate to register with a range of different online services. 
However, not all attributes are required, or even appropriate, to be revealed 
to each service: some may only need the email address, whereas others need 
perhaps only forename, year of birth, gender, and the photo. 


User $A$ should, therefore, be able to choose which subset $S \subseteq \mathbb{Z}_{10}$ of these 10
attributes they want to reveal each time they present their certificate to a service.
One solution would be that $C$ signs for each user $2^{10}$ different certificates, each
including a different subset of attributes. But that would be rather inefficient. 


Propose a certificate format, where $C$ generates just one digital signature for each
user A, but A then can modify their certificate to remove any subset of the ten 
attribute values, such that the recipient still can be sure the received attribute 
values are authentic, while not being able to infer the value of the removed 
attributes (except with negligible probability in polynomial time). Explain in 
detail what A receives from the certification authority, and what A provides to 
a service that only needs a certificate covering attribute subset S. [8 marks] 



CST2.2024.9.6 


5 Denotational Semantics 


In all parts of this question, you are allowed to use theorems from the course, provided 
you state them precisely beforehand. You may also extend a proof by (rule) induction 
from the course with new cases without reproving the ones from the course, again 
provided you clearly state the proof you are extending. 


(a) Given domains $D_1$, $D_2$, $E_1$ and $E_2$, and continuous functions $f_1 : D_1 \to E_1$ and
$f_2 : D_2 \to E_2$, show that

$$f_1 \times f_2 : D_1 \times D_2 \to E_1 \times E_2, \qquad (d_1, d_2) \mapsto (f_1(d_1), f_2(d_2))$$


is continuous. [6 marks] 


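We wish to extend PCF with the product type $\tau_1 * \tau_2$, by adding the new terms fst,
snd and pair to the language, such that

$$\frac{\Gamma \vdash t : \tau_1 * \tau_2}{\Gamma \vdash \mathrm{fst}(t) : \tau_1} \qquad \frac{\Gamma \vdash t : \tau_1 * \tau_2}{\Gamma \vdash \mathrm{snd}(t) : \tau_2} \qquad \frac{\Gamma \vdash t_1 : \tau_1 \quad \Gamma \vdash t_2 : \tau_2}{\Gamma \vdash \mathrm{pair}(t_1, t_2) : \tau_1 * \tau_2}$$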


and with the following operational semantics: 


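$$\frac{t \Downarrow_{\tau_1 * \tau_2} \mathrm{pair}(v_1, v_2)}{\mathrm{fst}(t) \Downarrow_{\tau_1} v_1} \qquad \frac{t \Downarrow_{\tau_1 * \tau_2} \mathrm{pair}(v_1, v_2)}{\mathrm{snd}(t) \Downarrow_{\tau_2} v_2} \qquad \frac{t_1 \Downarrow_{\tau_1} v_1 \quad t_2 \Downarrow_{\tau_2} v_2}{\mathrm{pair}(t_1, t_2) \Downarrow_{\tau_1 * \tau_2} \mathrm{pair}(v_1, v_2)}$$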


(b) Give a denotational semantics for the product type $[\![\tau_1 * \tau_2]\!]$ in terms of $[\![\tau_1]\!]$ and
$[\![\tau_2]\!]$. [2 marks]


(c) Give a denotational semantics for fst, snd and pair, extending the semantics of 
PCF from the lectures, and justify why this semantics is well-defined according 
to the typing rules given above. [6 marks] 


(d) Recall what it means for denotational semantics to be sound. Show that the 
semantics you have just given is sound. [6 marks] 
6 Hoare Logic and Model Checking


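Consider the temporal logic CTL over atomic propositions $p \in AP$:

$$\psi \in \mathrm{StateProp} ::= \bot \mid \top \mid \neg\psi \mid \psi_1 \wedge \psi_2 \mid \psi_1 \vee \psi_2 \mid \psi_1 \to \psi_2 \mid p \mid \mathrm{A}\,\phi \mid \mathrm{E}\,\phi$$
$$\phi \in \mathrm{PathProp} ::= \mathrm{X}\,\psi \mid \mathrm{F}\,\psi \mid \mathrm{G}\,\psi \mid \psi_1 \,\mathrm{U}\, \psi_2$$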


(a) Specify the following properties as CTL formulae over AP = {p, q}. 


(i) If a state satisfying p cannot be reached, then q always holds. [3 marks]


(ii) From all reachable states, there is some path along which p holds, until it
reaches a state from which no possible next state satisfies q. [3 marks] 


(b) Consider a temporal model M over atomic propositions AP = {p,q,r,s}, with
states {1,2,3,4,5}, initial state 1, and transitions and state labelling as shown 
in the diagram (e.g. in state 1, atomic propositions p and s hold). 


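[Figure: state-transition diagram of M. State 1 is labelled {p, s}, state 2 is labelled {p} and state 5 is labelled {p}; the two remaining states are labelled {p, q} and {r}.]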


Informally describe the meaning of each of the following CTL formulae over AP 
and explain whether or not they hold in the model. 


(i) $\mathrm{A}((q \wedge s) \,\mathrm{U}\, (\mathrm{EF}\, r))$ [2 marks]
(ii) $\mathrm{EG}(p \wedge \mathrm{AX}\, p)$ [3 marks]


(c) (i) Informally explain the difference in the properties that can be expressed by
LTL and CTL. [3 marks]


(ii) Consider the LTL formula $\phi = p \,\mathrm{U}\, (\mathrm{X}\, q)$ and CTL formula $\psi = \mathrm{A}(p \,\mathrm{U}\, (\mathrm{AX}\, q))$,
both over atomic propositions AP = {p,q}. Formally define a temporal
model over AP that shows that $\phi$ and $\psi$ are not equivalent. Explain why
your temporal model satisfies one of the formulae but not the other. 

[6 marks] 
7 Information Theory


(a) Draw a diagram that relates the mutual information between two random
variables to their entropies, conditional entropies and joint entropy. [2 marks]


(b) In the analysis of continuous signals, explain why we often constrain the variance
of the signal. What input distribution gives the maximum entropy under this 
constraint? [3 marks] 


(c) What is the Gaussian channel? Why is it particularly relevant in the analysis
of real world communications systems? [3 marks] 


(d) Consider a Gaussian channel with input, output and noise represented by
random variables X, Y and Z, such that Y = X + Z. State with justification, 
but without a detailed proof, the probability distribution of X that achieves the 
capacity. Derive an expression for this capacity. 


[Note: You may use the result that the entropy of a normally distributed random 
variable $X \sim \mathcal{N}(\mu, \sigma^2)$ is $\frac{1}{2}\log(2\pi e \sigma^2)$ without proof.] [7 marks]
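
The entropy formula quoted in the note can be sanity-checked numerically. The sketch below is only an illustration: it assumes natural logarithms (so the entropy is in nats), and the function names, step count and integration width are hypothetical choices, not part of the question.

```python
import math


def gaussian_entropy_closed_form(sigma):
    """Differential entropy of N(mu, sigma^2) in nats: 0.5 * ln(2*pi*e*sigma^2)."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)


def gaussian_entropy_numeric(sigma, mu=0.0, steps=20000, width=12.0):
    """Approximate -integral p(x) ln p(x) dx with a midpoint Riemann sum.

    The integral is truncated to mu +/- width * sigma (an assumption);
    the tails beyond that contribute a negligible amount.
    """
    lo = mu - width * sigma
    hi = mu + width * sigma
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * dx
        p = math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
        if p > 0.0:
            total -= p * math.log(p) * dx
    return total
```

The entropy does not depend on the mean, which is why only `sigma` appears in the closed form.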


(e) The Nyquist sampling theorem says that a signal with maximum frequency f
must be sampled at no less than 2f to allow reconstruction. Use this,
together with your answer to (d), to derive the capacity of a Gaussian channel 
where the noise has bandwidth limited to B. [2 marks] 


(f) Use your answer to (e) to explain how an ultra-wideband (UWB) communications
system (with bandwidths of multiple GHz) can avoid interference with
other non-UWB users of the same part of the radio spectrum. [3 marks] 
8 Machine Learning and Bayesian Inference


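(a) Consider the support vector machine (SVM) for inputs $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$, $(\mathbf{x}, y) \in \mathcal{X} \times \{0, 1\}$,
$\mathcal{X} \subset \mathbb{R}^d$. Let $(\mathbf{w}, b) \in \mathbb{R}^d \times \mathbb{R}$ denote the parameters which define
the maximum-margin hyperplane returned by the SVM.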


(i) The SVM classification function is given by the maximum-margin hyperplane:

$$f(\mathbf{x}) = \operatorname{sgn}(\langle \mathbf{w}, \mathbf{x} \rangle + b).$$

Express $\mathbf{w}$ in terms of the dual variables $\{\lambda_i\}_{i=1}^{n}$ associated with the margin
violation constraints. Hence rewrite $f(\mathbf{x})$ in terms of $\lambda_i$ and interpret how
the hypothesis function classifies an unseen point $\mathbf{x}^*$. [4 marks]


(ii) What property of the hypothesis function allows the extension of the SVM
to define a nonlinear decision boundary in the feature space $\mathcal{X}$? State the
nonlinear extension of f(x) and explain how this may improve classification 
performance. [4 marks] 


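(iii) Consider the SVM primal objective, where $C > 0$ is a constant:

$$\operatorname*{argmin}_{\mathbf{w}, \boldsymbol{\xi}} \left( \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{i=1}^{n} \xi_i \right)$$

Let $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$ be the Gram matrix evaluated on training points, where
$k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a valid kernel function. Assume we have subsumed the
bias $b$ into the kernel. Write the kernelised SVM objective in terms of $K$, $\boldsymbol{\alpha}$
and $\boldsymbol{\xi}$, where $\alpha_i = \lambda_i y_i$. [3 marks]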


(iv) Rewrite the SVM objective in terms of the hypothesis function applied to
each example, $\mathbf{f} = (f(\mathbf{x}_1), \ldots, f(\mathbf{x}_n))^\top$. [Hint: Relate $\mathbf{f}$ and $\boldsymbol{\alpha}$.] [3 marks]


(b) Consider modelling the data previously given as an underlying function
contaminated with additive Gaussian noise, $y_i = f(\mathbf{x}_i) + \epsilon_i$, where $\epsilon_i \sim \mathcal{N}(0, \sigma^2)$.
Model the function $f$ as a Gaussian process with zero mean, and covariance
function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, i.e. $f \sim \mathcal{GP}(0, k)$.

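[Hint: Recall for a d-dimensional normally distributed random variable
$\mathbf{z} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)$, $p(\mathbf{z}) = \left((2\pi)^d \det \Sigma\right)^{-1/2} \exp\left(-\frac{1}{2}(\mathbf{z} - \boldsymbol{\mu})^\top \Sigma^{-1} (\mathbf{z} - \boldsymbol{\mu})\right)$.]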


(i) Let $\mathbf{f}$, $\mathbf{y}$, $X$ denote the collection of function values, training labels and
feature vectors, respectively. Write down the log-prior, $\log p(\mathbf{f} \mid X)$, in terms
of the Gram matrix $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$. [2 marks]


(ii) Find the posterior over $\mathbf{f}$, $p(\mathbf{f} \mid X, \mathbf{y})$. Neglect the normalisation and compute
the un-normalised log-posterior, neglecting constant terms. Compare
against the SVM objective found in Part (a)(iv). [4 marks] 
9 Optimising Compilers


Compilers for functional languages sometimes perform strictness analysis to pass 
parameters by value rather than by name. 


(a) Define the concept of strictness and how it differs from neededness. [3 marks]

(b) Write functions with two parameters for each of the following cases.


(i) A function that is strict in both parameters but only the first is needed. 
[2 marks] 


(ii) A function that is strict in its first parameter and only the first is needed.
[2 marks] 


(iii) A function that is strict in neither parameter and neither is needed. 
[1 mark] 


(c) Perform strictness analysis on the following program to obtain its strictness
function. In which parameter(s) is f strict? 


f(a, b, c) 
= if a<1 then b elif a<2 then c else f(a-c, b, c);


You may use the following built-in strictness functions. 


0 
ae 
$\mathit{lt}^\sharp(x, y) = x \wedge y$
$\mathit{sub}^\sharp(x, y) = x \wedge y$
$\mathit{cond}^\sharp(p, x, y) = p \wedge (x \vee y)$


[6 marks] 
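
As a small illustration of how such strictness functions are read, the sketch below encodes the three built-ins over the two-point domain {0, 1}. It assumes the built-ins are those for `<`, `-` and the conditional; the Python names (`lt_sharp`, `sub_sharp`, `cond_sharp`, `strict_in`) are hypothetical. A function is strict in a parameter when its strictness function returns 0 whenever that argument is 0, which can be tested by setting every other argument to 1.

```python
def lt_sharp(x, y):
    """Strictness function for '<': needs both operands."""
    return x & y


def sub_sharp(x, y):
    """Strictness function for '-': needs both operands."""
    return x & y


def cond_sharp(p, x, y):
    """Strictness function for the conditional: needs the test and one branch."""
    return p & (x | y)


def strict_in(fn_sharp, arity, index):
    """Return True if fn_sharp yields 0 when argument `index` is 0 and all others are 1.

    Because these strictness functions are monotone, this single check is
    enough to decide strictness in that parameter.
    """
    args = [1] * arity
    args[index] = 0
    return fn_sharp(*args) == 0
```

For example, `strict_in(cond_sharp, 3, 0)` holds because the test of a conditional is always evaluated, whereas `strict_in(cond_sharp, 3, 1)` does not, since either branch alone may be skipped.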


(d) After strictness optimisation, some parameters remain passed by name, yet you
wish to evaluate these as early as possible within the function whilst maintaining 
strictness properties. Describe an analysis to do this, and explain the results on 
the program in part (c). [6 marks] 
10 Principles of Communications


(a) The Internet is a shared resource. Users compete to send traffic, but need to
cooperate to conserve resources. However, user traffic has two fundamentally
different utility curves: elastic and inelastic.


In designing resource sharing schemes, two different fairness goals have been 
defined: proportional fairness versus max-min fairness. 


How do these goals reflect the traffic requirements of the two different utility 
curves? [10 marks] 


(b) Routers can support different types of schedulers to provide fairness and isolation
between traffic flowing between different sources and destinations. 


You have read about fair queueing, and hear someone has proposed a simpler 
scheme which could remove the requirement for per-flow state in the scheduler. 
The proposal is to use a random scheduler. Would that be fair? What about
isolation? 


What are the general considerations about traffic destined to be handled by such 
a scheduler, for it to work reasonably well? [10 marks] 
11 Quantum Computing


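[Figure: quantum phase estimation circuit with a first register (initialised to $|0\rangle$, followed by Hadamard gates) and a second register in state $|u\rangle$; one sub-circuit is enclosed in a dashed box.]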


(a) The figure shows the circuit for quantum phase estimation of a Hadamard gate.
What is the function of the sub-circuit shown in the box marked with the dashed 
line, and to how many bits of precision is the estimate of the phase given? 

[2 marks] 


(b) The Hadamard gate has matrix $\begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}$. What are its eigenvectors
and corresponding eigenvalues? Express each eigenvector as a quantum state 
(that is, as superposition of computational basis states). [5 marks] 


(c) Simplify the circuit in the figure such that when the initial state of the first
register is $|00\rangle$ as specified, the top wire only involves a swap gate and a
measurement. [6 marks] 


(d) Quantum phase estimation is performed using the circuit given in the figure with
$|u\rangle = a|0\rangle + b|1\rangle$. Express the three-qubit state $|\psi\rangle$ in terms of $a$ and $b$. Verify
that if $|u\rangle$ is a correctly normalised quantum state then so is $|\psi\rangle$. [7 marks]
12 Randomised Algorithms


Given an undirected graph $G = (V, E)$, an independent set is a subset $I \subseteq V$ such
that for any two vertices $u \in I$, $v \in I$, there is no edge $\{u, v\} \in E(G)$. Let $\alpha(G)$
denote the size of the largest independent set in $G$.


(a) Consider the following randomised algorithm for computing an independent set,
which takes as input an undirected graph $G = (V, E)$ and a fixed parameter
$p \in (0, 1)$:


Step 1: Starting with an empty set S, add each vertex from V(G) to S 
independently with probability p. 


Step 2: Go through all edges $e = \{u, v\} \in E(G)$, and for any edge e which
had both vertices in S after Step 1, remove u and v from S. 
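
The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions: vertices are hashable values, edges are pairs of vertices, and the names `random_independent_set`, `is_independent` and the optional `rng` parameter are hypothetical. Removals in Step 2 are decided against the sample produced by Step 1, as the description specifies.

```python
import random


def random_independent_set(vertices, edges, p, rng=None):
    """Run Step 1 and Step 2 once and return the resulting set S."""
    if rng is None:
        rng = random.Random()

    # Step 1: include each vertex independently with probability p.
    sampled = {v for v in vertices if rng.random() < p}

    # Step 2: for every edge with both endpoints sampled, remove both endpoints.
    to_remove = set()
    for u, v in edges:
        if u in sampled and v in sampled:
            to_remove.update((u, v))

    return sampled - to_remove


def is_independent(subset, edges):
    """Check that no edge has both endpoints inside subset."""
    return not any(u in subset and v in subset for u, v in edges)
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when comparing the average output size over many runs with a predicted expectation.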


(i) Justify briefly why the output $S$ of this algorithm is an independent set.
[2 marks] 


(ii) Is the output $S$ necessarily maximal, i.e., it is not possible to add any vertex
$u \in V$ to $S$ and obtain a larger independent set? Justify your answer.
[3 marks]


(iii) Prove that the expected size of the output S after the second step of the 
algorithm is $p \cdot |V| - p^2 \cdot |E|$. [4 marks]


(iv) How would you choose p in order to maximise the expected size of S, as 
computed in (a)(iii)? [4 marks]


(v) What does your answer in (a)(iv) imply for $\alpha(G)$? Justify your answer.
[3 marks] 


(b) Formulate the problem of finding the largest independent set as an Integer
Program (I), and describe the Linear Programming Relaxation (L). [4 marks] 
13 Types

(a) Derive the following entailments with the natural deduction system for classical
logic. 


(i) Show that $A \vee B;\ A \vdash B\ \mathrm{true}$. [5 marks]


(ii) Show that $A \vee B, \neg A;\ \cdot \vdash B\ \mathrm{true}$. [7 marks]


(b) Consider System F extended with existential types, products, and a natural
number type. 


(i) Give a Church encoding for an optional natural number type (corresponding
to nat option in OCaml). [2 marks]


(ii) Give an existential type corresponding to an abstract type of optional
naturals, with constructors for Some and None, as well as a case analysis 
operation. It should correspond to the following OCaml module signature: 


module type ONAT = sig
  type t
  val none : t
  val some : nat -> t
  val case : t -> 'a -> (nat -> 'a) -> 'a
end


[3 marks] 


(iii) Give an implementation of this existential type. [3 marks]


END OF PAPER 